4 research outputs found

    An Information Retrieval Test Collection for English SMS Conversations

    Information retrieval research for informal conversational settings differs in important ways from the more traditional goal of document retrieval. The goal of this research is to build an information retrieval test collection from informal conversational messages and to demonstrate the use of that collection to compare the retrieval effectiveness of some information retrieval systems. The test collection is based on the Linguistic Data Consortium's collection of more than 8,000 English SMS (Short Message Service) conversations, which contain more than 120,000 individual messages. The collection is described, followed by a description of the processes for creating and collecting topics, performing relevance judgments, and establishing baseline results. The findings indicate that traditional approaches for building information retrieval test collections can reasonably be applied to preclustered SMS conversations, but that the process of creating relevance judgments is somewhat more challenging and thus the reliable detection of differences in system effectiveness is somewhat more complex.

    Using Zero-Resource Spoken Term Discovery for Ranked Retrieval

    Research on ranked retrieval of spoken content has assumed the existence of some automated (word or phonetic) transcription. Recently, however, methods have been demonstrated for matching spoken terms to spoken content without the need for language-tuned transcription. This paper describes the first application of such techniques to ranked retrieval, evaluated using a newly created test collection. Both the queries and the collection to be searched are based on Gujarati produced naturally by native speakers; relevance assessment was performed by other native speakers of Gujarati. Ranked retrieval is based on fast acoustic matching that identifies a deeply nested set of matching speech regions, coupled with ways of combining evidence from those matching regions. Results indicate that the resulting ranked lists may be useful for some practical similarity-based ranking tasks.

    The FIRE 2013 question answering for the spoken web task. Forum for Information Retrieval Evaluation

    The FIRE 2013 Question Answering for the Spoken Web (QASW) task was an information retrieval evaluation in which the goal was to match spoken Gujarati questions to spoken Gujarati answers. This paper describes the design of the task, the development of the test collection, the runs that were submitted, and the corresponding results.